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Abstract 


In  this  paper  we  investigate  the  application  of  Krylov  methods  to  compress¬ 
ible  flows,  and  the  effect  of  implicit  boundary  conditions  on  the  implicit  solution 
of  nonlinear  problems.  Two  defect-correction  procedures,  namely,  Approximate 
Factorization  (AF)  for  structured  grids,  and  ILU/GMRES  for  general  grids  are 
considered.  Also,  considered  here,  is  Newton-Krylov  matrix-free  methods  that  we 
combine  with  the  use  of  mixed  discretization  schemes  in  the  implicitly  defined  Ja¬ 
cobian  and  its  preconditioner.  Numerical  experiments  that  show  the  performance 
of  our  approaches  are  then  presented. 
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1  Introduction 


The  implicit  discretization  of  the  compressible  flows  leads  to  a  large  sparse  linear 
system  which  needs  to  be  solved  at  each  time  step.  In  the  derivation  of  this  system 
one  often  uses  a  defect- correction  procedure,  in  which  the  left-hand  side  of  the 
system  is  discretized  using  a  lower  order  approximation  than  for  the  right-hand 
side.  This  is  due  to  storage  considerations  and  computational  complexity,  and 
also  to  the  fact  that  the  resulting  lower  order  matrix  is  better  conditioned  than  the 
higher  order  matrix.  The  resulting  methods  are  only  moderately  implicit.  In  the 
case  of  structured,  body-fitted  grids,  the  linear  system  can  easily  be  solved  using 
approximate  factorization  (AF),  which  is  among  the  most  widely  used  methods 
for  such  grids.  For  unstructured  grids,  such  techniques  are  no  longer  valid,  and 
the  system  is  solved  using  direct  or  iterative  methods.  Because  of  the  prohibitive 
computational  costs  and  large  memory  requirements  for  the  solution  of  compress¬ 
ible  flows,  iterative  methods  are  preferred.  In  these  defect-correction  methods, 
which  are  of  practice  in  most  CFD  computer  codes,  the  mismatch  in  the  right  and 
left  hand  side  operators  together  with  explicit  treatment  of  the  boundary  condi¬ 
tions,  lead  to  severely  limited  CFL  number,  which  results  in  a  slow  convergence  to 
steady  state  aerodynamic  solutions.  Many  authors  have  tried  to  replace  explicit 
boundary  conditions  with  implicit  ones  (see  for  instance  [19], [15],  and  [8]).  They 
showed  that  high  CFL  number  can  be  used,  however  no  clear  advantages  in  terms 
of  CPU  time  as  compared  to  explicit  boundary  conditions  have  been  drawn. 

We  investigate  here  defect-correction  procedures  based  on  Krylov  methods; 
more  particularly  we  study  the  ILU/GMRES  methods  together  with  implicit 
treatment  of  the  boundary  conditions.  We  show  in  particular  that,  in  the  context 
of  Krylov  methods  improvements  in  terms  of  convergence  rate  can  be  achieved 
through  the  use  of  implicit  boundary  conditions  as  compared  to  explicit  ones. 
However,  the  attractive  Newton’s  method  convergence  property  cannot  be  ap¬ 
proached  because  of  the  mismatch  of  the  right  and  left  hand  side  operators. 
Therefore  we  propose  to  use  Newton-Krylov  matrix-free  (see  [3])  methods  com¬ 
bined  with  mixed  discretization  in  the  implicitly  defined  Jacobian-Preconditioner. 
Numerical  experiments  that  show  the  performance  of  our  approach  are  then  pre¬ 
sented. 

In  the  next  section,  we  describe  the  Newton-Krylov  methodology  together 
with  mixed  discretization.  We  present,  in  the  section  3,  the  Euler  solver.  The 
description  of  the  implicit  boundary  conditions  is  also  given  in  the  section  3. 
Numerical  experiments  are  presented  in  the  section  4.  The  last  section  is  devoted 
to  some  remarks  and  extensions. 
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2  Newton-Krylov  Methods 

Newton-Krylov  methods  first  proposed  by  Brown  and  Saad  [3],  have  been  inves¬ 
tigated  for  compressible  Euler  and  Navier-Stokes  equations  using  unstructured 
grids  in  [16],  [17],  and  [7],  and  for  structured  grids  in  [4],  and  [5]. 

In  [16]  and  [17],  the  author  has  applied  the  matrix-free  Newton-Krylov  method¬ 
ology  to  both  the  transonic  and  supersonic  compressible  Navier-Stokes  flows.  In 
[4]  and  [5],  the  authors  have  studied  a  convection-diffusion  model  problem,  full 
potential  flows  and  the  transonic  compressible  Euler  flows.  They  have  also  pro¬ 
posed  and  studied  the  Newton-Krylov-Schwarz  methodology.  An  application  to 
incompressbile  flows  is  reported  in  [9]. 

Newton-like  methods,  together  with  fully  implicit  linear  solvers  allow,  in  prin¬ 
ciple,  a  more  rapid  asymptotic  approach  to  steady  states,  f(u)  —  0,  than  do 
time-explicit  methods  or  semi-implicit  methods  based  on  defect  correction.  Strict 
Newton  methods  have  the  disadvantage  of  requiring  solutions  of  linear  systems 
of  equations  based  on  the  Jacobian,  /„(«),  of  the  true  steady  nonlinear  residual 
and  are  often  impractical  in  several  respects: 

1.  Their  quadratic  convergence  properties  are  realized  only  asymptotically. 
In  early  stages  of  the  nonlinear  iteration,  continuation  or  regularization  is 
typically  required  in  order  to  prevent  divergence. 

2.  Some  popular  discretizations  (e.g.,  using  limiters)  of  f(u )  are  nondifferen- 
tiable,  leaving  the  Jacobian  undefined  in  a  continuous  sense. 

3.  Even  if  fu(u )  exists,  it  is  often  inconvenient  or  expensive  to  form  either 
analytically  or  numerically,  and  may  be  inconvenient  to  store. 

4.  Even  if  the  true  Jacobian  is  easily  formed  and  stored,  it  may  have  a  bad 
condition  number. 

5.  The  most  popular  family  of  preconditioners  for  large  sparse  Jacobians  on 
structured  or  unstructured  two-  or  three-dimensional  grids,  the  incomplete 
factorizations,  is  difficult  to  parallelize  efficiently. 

In  this  paper  we  examine  how  points  (1),  (3)  and  (4)  may  be  addressed  through 
Newton-Krylov  methods.  For  point  (2)  we  refer  to  [18],  and  for  point  (5)  we  refer 
to  [4]  and  [5]. 

The  memory  requirements  and  the  computational  complexity  for  the  higher- 
order  matrix  representation,  whether  by  analytical  or  numerical  means,  are  pro¬ 
hibitive.  In  this  context,  matrix-free  Newton-Krylov  methods,  in  which  the  action 
of  the  Jacobian  is  required  only  on  a  set  of  given  vectors  are  natural.  To  solve 
the  nonlinear  system  f(u )  =  0,  given  u°,  let  ul+1  =  ul  +  Xl8ul ,  for  /  =  0, 1, . . . , 
until  the  residual  is  sufficiently  small,  where  8ul  approximately  solves  the  Newton 
correction  equation  J(ul)8ul  =  — f(ul ),  and  parameter  X1  is  selected  by  some  line 
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search  or  trust  region  algorithm  [6].  Krylov  methods,  such  as  the  method  of  con¬ 
jugate  gradients  for  symmetric  positive  definite  systems  or  GMRES  for  general 
nonsingular  systems,  find  the  best  approximation  of  the  solution  in  a  relatively 
small-dimensional  subspace  that  is  built  up  from  successive  powers  of  the  Jaco¬ 
bian  on  the  initial  residual.  The  Krylov  solver  used  throughout  this  paper  is 
GMRES  [13]. 

The  action  of  Jacobian  J  on  an  arbitrary  Krylov  vector  w  can  be  approximated 
by 

J{ul)w  ~  -  [f(ul  +  etc)  —  /(«')  . 

Finite-differencing  with  e  makes  such  matrix-free  methods  potentially  much  more 
susceptible  to  finite  word-length  effects  than  ordinary  Krylov  methods  [9]. 

The  selection  of  an  optimal  parameter  e,  is  non  trivial.  If  e  is  too  small  then 
the  rounding  errors  made  in  the  numerator  are  amplified  by  a  factor  of  order  ^ 
which  leads  to  an  inaccurate  result.  If  on  the  other  hand  e  is  too  large  then  the 
approximation  of  J(ul)w  will  be  poor.  Any  reasonable  choice  of  e  should  attempt 
to  reach  a  compromise  between  these  two  difficulties.  The  technique  for  choosing 
the  scalar  e  we  use  here  is: 

e  =  •  max{|(u\u)|,typu'M}- 

HI 

where  |t?|  =  (|ui|, ...,  |u„|)T,  and  typu  is  given  value  depending  on  u  and  the 
problem  to  be  solved.  We  note  here  that  GMRES  may  have  an  advantage  over 
other  Krylov  methods  in  the  matrix-free  context  in  that  the  vectors  v  that  arise  in 
GMRES  have  unit  two-norm,  but  may  have  widely  varying  scale  in  other  Krylov 
methods  for  nonsymmetric  systems.  Right  preconditioning  spoils  the  perfect  unit 
two-norm.  For  an  extended  discussion  of  matrix-free  applications  of  the  Jacobian 
in  the  Krylov  context,  see  [3]. 

Steady  aerodynamics  applications  require  the  solution  of  linear  systems  that 
lack  strong  diagonal  dominance,  so  it  is  important  to  verify  that  properly-scaled 
matrix-free  methods  can  be  employed  in  this  context. 

Although  the  matrix-free  method  is  attractive  because  it  does  not  form  the 
matrix  explicitly,  the  matrix  is  still  required  for  preconditioning  purposes.  In  [16], 
[17],  and  [7]  the  authors  settled  for  a  compromise  that  uses  a  block-diagonal  pre¬ 
conditioner.  However,  most  preconditioners  require  the  matrix-explicitly.  This 
is  true  for  ILU  preconditioner.  Therefore,  we  form  only,  as  in  defect-corection 
method  only  the  matrix  of  a  lower  order  system  to  precondition  the  consistent 
higher  order  system.  An  approximation  to  the  Jacobian  can  be  used  to  precon¬ 
dition  the  Krylov  process.  Examples  are: 

1.  the  Jacobian  of  a  lower-order  discretization, 

2.  the  Jacobian  of  a  related  discretization  that  allows  economical  analytical 
evaluation  of  elements,  and 
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3.  domain-parallel  preconditioners  composed  of  Jacobian  blocks  on  subdo¬ 
mains  of  the  full  problem  domain. 

We  consider  here  only  cases  1  and  2.  Case  3  can  be  combined  with  any  of  the 
split-discretization  techniques  (cases  1-2),  in  principle  and  is  studied  in  [4]  and 

[51- 

Left  preconditioning  of  the  Jacobian  with  an  operator  B  1  can  be  accommo¬ 
dated  via 

B~xJ(ul)w  «  -  B~xf((ul  +  ew))  -  f(v! )  , 
where  /(id)  =  B~lf{ul)  is  stored  once,  and  right  preconditioning  via 

J(ul)B~xw  «  -  [f((ul  +  eB~xw))  -  f(ul )  . 

Right  preconditioning  is  preferable  when  the  focus  is  on  comparing  different  pre¬ 
conditioners,  since  the  residual  norm  measured  as  a  by-product  in  GMRES  and 
used  in  the  termination  test  is  independent  of  any  right  preconditioning.  On  the 
other  hand,  any  left  preconditioning  changes  the  residual  norm  estimate  available 
as  a  by-product  in  GMRES.  Left  preconditioning  may  be  preferable  when  GM¬ 
RES  is  applied  in  practice  as  the  solver  for  an  inexact  Newton  method.  When 
the  preconditioning  B~x  is  of  high  quality,  the  left-preconditioned  residual  serves 
as  an  estimate  of  the  error  in  the  Newton  update  vector.  This  leads  to  a  useful 
termination  condition  when  Newton  step  acceptance  tests  are  based  on  ||du||. 

3  Compressible  Euler  Equations 

3.1  Governing  Equations 

The  non-dimensional  Euler  Equations  in  three  dimensions  for  the  dependent  vari¬ 
able  vector  Q  =  [p.  pu ,  pv,  pw ,  e]T  are 

Qt  +  F(Q)x  +  G(Q)y  +  H(Q)z  =  0,  (1) 

After  changing  the  variables  into  the  curvilinear  coordinates 

T  =  t,(  =  £(x,  y,z),  T)  =  r)(x,y,z),(= 

The  equations  are  now  expressed  in  the  strong  conservation  laws  as 

Qr  +  (F\  +  (G)„  +  {H\  =  0,  (2) 
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where  Q  and  the  contravariant  flux  vectors,  F  and  G,  are  defined  in  terms  of  the 
Cartesian  fluxes  and  the  Jacobian  determinant  of  the  coordinate  system  trans¬ 
formation,  through 

q  =  J~lQ 

F  =  j-'({tQ  +  txF  +  tyG  +  (zH) 

G  =  J-1  (r)tQ  +  tjxF  +  TjyG  +  rjzH) 

H  =  J-1  (CtQ  +  (XF  +  (yG  +  (ZH) . 


d(x,y,z,t) 

(  £,x  £y  £t 

J  ,  Vx  Vy  Vz  T)t 

det  C,  d  C.  6 

V  o  o  o  i 

(Si  dy  Sz  \ 

Vx  Vy  Vz 

Cx  (y  Cz  ) 


3.2  Finite  volume  scheme 

By  assuming  the  dependent  variables  to  be  constant  in  the  interior  of  cell(i,j,k), 
and  that  the  flux  vectors  F,  G.  and  H  are  constant  over  the  constant  £,  tj,  and  £ 
surfaces  of  the  cell,  respectively,  then  an  implicit  finite  volume  discretization  of 
equation  (1)  can  be  written  as 


(QS  -  <3”y>i)A{A,AC  +  (F’+jV,  -  F"4'iM)A, ACAt 
+(G&M  -  +  -  "S-d^Ar  =  0  (3) 

Using  the  flux  split  formula  (see  below),  the  above  equations  can  be  written 


A  Qn  +  A  t($c(F+  4-  F")n+1  +  SV{G+  +  G~)n+1  +  8C{H+  +  H~)n+1)  =  0  (4) 

where  St,  for  example,  is  defined  by 


St  -  -^[Fi+1/2,j,k  -  Fi-i/2,j,k] 
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and  6V  and  are  defined  analogously.  The  split  vector  for  F  is  given  by 

F  =  F+  +  F~,  (6) 

with  similar  expressions  for  G  and  H.  F+  is  associated  with  the  eigenvalues  that 
have  positive  signs  and  F~  is  associated  with  the  eigenvalues  that  have  negatives 
signs  (see  Steger- Warming).,  and  G+,  G~ ,  H+,  andH~  are  defined  analogously. 
The  implicit  split-flux  discretization  is  given  by 

[I  +  A  r[(8^F+  +  F~)n+1  +  6t(G+  +  G~)n+1  +  8V(H+  +  H~  )n+1] 

=  -A  r(£|F"  +  8'Gn  +  8\Hn )  (7) 

A  linearization  of  first  order  in  time  of  the  above  equation  yields 


[/  +  A  t{6\A+-  +  8\A~-  +  81B+-  +  fyB".  +  8\C+-  +  8\C--)kQn 

= -Ar{8lFn  +  8'Gn  +  8ecHn)  (8) 

Where  a  distinction  has  been  made  between  the  implicit  spatial  difference  op¬ 
erator  and  the  explicit  spatial  difference  operators  by  using  supersripts  i  and  e, 
respectively.  The  dots  indicate  that  the  difference  operators  apply  to  the  product 
of  the  Jacobian  matrices  with  A Qn.  The  matrices  A+,  A~,  B+,  B~,  C+,  andC- 
are  defined  by 


A+  = 
B+  - 
C+  = 


The  finite  volume  discretization  given  by  equation  (3)  requires  the  numeri¬ 
cal  flux  at  a  cell  face.  These  fluxes  are  computed  using  the  Roe’s  approximate 
Riemann  solver  [12].  Three  limiters  are  employed:  minmod,  Superbee,  and  Van 
Leer.  The  Jacobians  are  evaluated  using  first-order  Roe’s  scheme,  or  the  flux- 
vector  split  scheme  [14],  which  corresponds  to  the  true  partials  of  the  positive  and 
negative  flux  vectors  as  described  earlier.  However,  the  flux-vector  split  scheme 
has  been  shown  to  give  improved  convergence  rates  over  the  Roe  matrices.  Be¬ 
cause  the  Jacobian  matrices  corresponding  to  the  flux-vector  split  scheme  work 
better  on  the  left  hand  side  in  the  solution  matrix  than  the  Roe  matrices,  this 
is  the  scheme  presently  being  employed  in  the  context  of  defect  correction.  This 
results  in  inconsistent  left  and  right  hand  side  operators. 
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Remark  3.1  For  most  CFD  codes,  the  implicit  spatial  differences  are  only  first- 
order  accurate.  Because  deriving  higher-order  accuracy  is  difficult,  and  the  re¬ 
sulting  matrices  are  very  large  and  require  a  lot  of  storage,  large  operation  count 
in  its  evaluation,  and  may  be  very  difficult  to  invert. 

Remark  3.2  For  Roe’s  scheme,  it  is  difficult  to  obtain  the  true  Jacobian.  Barth 
[1],  has  obtained  such  Jacobian  in  two  dimensions.  However,  for  three  dimen¬ 
sions,  the  evaluation  of  such  Jacobian  needs  large  operation  count. 

Following  these  remarks,  the  implicit  spatial  differences  in  equation  (1)  are 
approximated,  as  mentioned  above,  through  a  first-order  accurate  scheme. 

The  explicit  spatial  differences  in  equation  (1)  are  approximated  using  the 
higher  order  formulations  of  Roe’s  scheme,  that  are  based  on  the  work  of  Osher 
and  Chakravarthy  [11]. 

3.3  Explicit  boundary  conditions 

The  boundary  conditions  are  derived  using  the  locally  one-dimensional  charac¬ 
teristic  variable  boundary  conditions,  which  yields  (for  the  calculations  see  for 
example  [10]): 

3.3.1  Farfield-Subsonic  Inflow 

Pb  =  (1/2 )Pa  +  Pi  +  sign(A lk)p0c0\kx(ua  -  uf)  +  ky(va  -  vf)  +  k2(wa  -  u?,-)] 

Pb  =  Pa  +  [{Pb  ~  Pa)/c20] 
uh  =  ua  +  kx[(Pa- Pb)l(poCo)] sign(Afc) 
vb  =  va  +  ky[(Pa- Pb)/{p0c0)] sign(Afc) 
wh  =  wa  +  kz[{Pa-  Pb)/(poCo)} sign(A^) 

Note  that  these  signs  correspond  to  the  sign  of  the  first  three  eigenvalues,  and 
hence  this  is  a  means  of  writing  the  code  for  general  applications  with  arbitrary 
orientation  of  the  computational  coordinates.  The  point  a  is  outside  the  compu¬ 
tational  domain,  point  b  is  on  the  computational  boundary,  and  i  is  inside  the 
computational  domain. 

3.3.2  Farfield-Subsonic  Outflow 

Pb  =  Pa 

Pb  —  Pa  +  [(Pb  ~  Pa)/ Cq] 

ub  =  ua  +  kx[(Pa  -  Pb)j (p0c0)]sign(Ajr.) 

Vb  =  Va  +  £y[(Pa  -  Pb)/ (/?0c0)]sign(A'fc) 
wb  =  wa  +  kz[(Pa  -  Pb)/(p0c0)\ sign(Aj() 
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3.3.3  Impermeable  Surface 


Pb  —  Pt  “I"  PoC-o 

Ub  Uf  “  kx{Jzx\lx  d-  ky'V'i'  ”1”  kgXVr^ 

vb  =  vT  —  ky{kxuT  +  kyvT  +  Jczwr ) 
wb  -  wr  —  kz(kxur  +  kyvr  +  kzwr ) 

where  the  point  r  is  the  center  of  the  first  cell  from  the  boundary  and  the  minus 
sign  in  equation  (1)  is  used  if  r  is  in  the  positive  k  direction  from  the  boundary, 
and  the  plus  sign  is  used  if  r  is  in  the  negative  direction  from  the  boundary. 

3.3.4  Farfield- Supersonic  Inflow 

In  this  case  all  eigenvalues  have  the  same  sign.  Since  we  have  an  inflow  case  all 
variables  are  specified. 

3.3.5  Farfield-Supersonic  Outflow 

In  this  case  also,  all  eigenvalues  have  the  same  sign.  But  now  we  have  an  outflow 
case,  therefore,  all  variables  must  be  obtained  from  the  solution  in  the  compu¬ 
tational  domain.  All  variables  are  extrapolated  from  inside  the  computational 
domain  to  the  boundary. 


3.4  Implicit  boundary  conditions 

In  the  implicit  form,  the  above  boundary  conditions  can  be  written  in  the  form 
of  operators  formulated  as  functions  of  the  conservation  vector  W : 


FB(W)  =  0 


and  are  implemented  implicitly  through: 


dFB 

~dW 


8W  =  -FB{W). 


Using  these  implicit  boundary  conditions,  we  show  that  starting  from  a  small 
initial  CFL  number  (10),  CFL  may  be  adaptively  advanced  according  to: 


CFLm  =  CFL'  • 


n/(«)ri 

ll/WII1 


This  is  the  key  to  the  successful  implementation  of  Newton- Krylov  matrix-free 
method  studied  in  this  paper. 
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4  Numerical  Results 


To  test  the  different  methodologies  developed  here  we  consider  a  NACA0012 
steady  transonic  airfoil  at  an  angle  of  attack  of  1.25  degrees  and  a  freestream  Mach 
number  of  0.8.  We  consider  two  meshes,  with  2048  and  4096  cells,  respectively. 
We  call  the  first  mesh,  meshl,  while  the  second  will  be  denoted  by  mesh2.  In  all 
computations  performed  herein  the  solution  obtained  agrees  with  the  standard 
one. 

The  initial  code  is  an  EAGLE-derivative  code  [10]  that  employs  the  discretiza¬ 
tion  described  in  section  3  with  explicit  boundary  conditions,  over  a  body-fitted 
grid,  and  which  uses  a  linear  solver  of  an  approximate  factorization  (AF)  type 
(see  for  example  [2]). 

We  have  implemented  implicit  boundary  conditions  as  described  in  section 
3.  We  have  also  replaced  the  (AF)  solver  by  ILU/GMRES  solver.  And  we  have 
implemented  the  Newton-Krylov  matrix-free  methods. 

We  first  compare  the  defect-correction  procedures  of  (AF)  type  and 
ILU / GMRES  with  explicit  boundary  conditions  on  the  test  case  described  above. 
Then,  we  compare  these  results  with  those  obtained  by  replacing  explicit  bound¬ 
ary  conditions  by  implicit  ones.  Finally  we  study  the  performance  of  Newton- 
Krylov  matrix-free.  All  calculations  performed  here  are  done  on  the  same  Sparc20 
machine. 


4.1  Defect-Correction  procedures:  meshl  case 

4.1.1  Explicit  boundary  conditions 

We  compare  here  the  results  obtained  using  approximate  factorization  (AF) 
method  and  ILU/GMRES  when  the  boundary  conditions  are  explicit.  We  ob¬ 
serve  that  to  reach  the  same  level  of  accuracy,  the  CPU  time  necessary  for  AF 
method  is  almost  double  the  time  necessary  to  reach  the  same  level  of  accuracy 
with  the  Krylov  method  (ILU/GMRES)  as  can  be  seen  in  figures  1  and  2,  which 
show,  respectively,  the  iteration  count  versus  the  steady  residual  norm  and  the 
CPU  time  versus  the  steady  residual  norm.  Moreover,  using  (AF)  method  the 
steady  residual  norm  cannot  be  dropped  to  the  same  final  steady  residual  norm 
reached  by  the  converged  solution  obtained  using  ILU/GMRES.  As  can  be  seen 
below,  finer  mesh  is  needed  for  the  steady  residual  norm  to  be  dropped  to  the 
same  level  as  for  the  Krylov  method,  when  AF  is  used. 

4.1.2  Implicit  boundary  conditions 

We  first  compare  different  calculations  obtained  using  different  CFL  numbers. 
The  results  are  presented  in  figures  3  and  4  where  we  show  a  comparison  of  the 
CPU  time  versus  the  steady  residual  norm.  These  calculations  are  performed 
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using  a  CFL  number  equal  to  6.5,  100,  and  500  respectively.  From  these  com¬ 
parisons  we  can  see  that  using  high  CFL  number  improves  the  convergence  rate. 
However,  when  the  CFL  number  is  larger  than  100,  no  further  improvement  can 
be  obtained.  This  is  due  to  the  mismatch  of  the  left  and  right  hand  side  oper¬ 
ators.  We  will  see  in  the  next  section  that  this  drawback  can  be  removed  using 
Newton-Krylov  methodology  described  in  section  2.  We  will  now  validate  the 
CFL  strategy  described  in  section  3.  In  figures  5  and  6  we  compare  the  calcula¬ 
tions  performed  with  a  CFL  number  of  100  to  the  calculations  performed  with 
the  CFL  strategy  described  in  section  3.  This  comparison  shows  the  validation  of 
using  these  techniques.  It  also  shows  that  the  converged  solution  is  obtained  in 
about  the  same  CPU  time.  In  figure  7,  we  show  the  CFL  versus  iteration  count. 
Now  we  show  in  figures  8  and  9,  comparisons  of  steady  state  residual  norm  ver¬ 
sus  the  iteration  count  and  CPU  time  for  the  converged  solution  obtained  using 
explicit  boundary  conditions  with  a  CFL  number  equal  to  6.5  and  using  implicit 
boundary  conditions  with  a  CFL  of  100.  We  observe  that  an  improvement  in 
terms  of  the  CPU  time  is  obtained  when  we  use  implicit  boundary  conditions  as 
compared  to  explicit  ones.  This  comparison  highlights  the  gain  obtained  using 
implicit  boundary  conditions  when  Krylov  methods  are  used  as  linear  solvers.  We 
should  notice  that  using  AF  solver,  the  implicit  boundary  conditions  do  not  im¬ 
prove  the  convergence  rate  because  the  AF  method  is  based  on  an  approximation 
of  first  order  to  the  linear  system  to  be  solved. 

4.2  Newton-Krylov  matrix-free  procedures:  meshl  case 

We  study  here  the  Newton-Krylov  matrix-free  methodology  described  in  section 
2.  The  techniques  used  in  the  choice  of  the  finite  differencing  parameter  are  de¬ 
scribed  in  section  2.  To  take  full  advantage  of  the  power  of  Newton’s  method,  and 
thus  to  allow  a  more  rapid  asymptotic  convergence  to  the  steady  state  solution,  we 
use  the  CFL  strategy  described  in  section  3,  and  validated  above.  The  ILU  pre¬ 
conditioner  we  use  here  is  formed  from  a  lower  order  discretization  and  is  exactly 
the  same  as  that  used  already  in  the  defect- correction  procedure  studied  above. 
This  results  in  a  mixed  Jacobian/Preconditioner  discretization.  More  precisely, 
the  explicitly  available  (Van  Leer)  first-order  flux  vector  split  Jacobian  (  Jvl )  is 
used  to  precondition  the  implicitly  defined  (Roe)  higher-order  flux  difference  split 
Jacobian  (Jr)  at  each  implicit  time  step.  In  matrix  terms,  the  correction  u  is 
obtained  as  the  approximate  solution  of, 

(Jvl)~1  Jru  —  (Jvl)~^  Jr- 

While  in  the  defect-correction  context,  this  correction  was  obtained  as  the 
approximate  solution  of, 

Jvlu  =  —  Jr. 

We  first  compare  the  results  obtained  using  ILU / GMRES  with  implicit  bound¬ 
ary  conditions  and  with  CFL  of  100,  and  using  Newton-Krylov  matrix-free.  The 
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results  are  presented  in  figures  10  and  11,  in  which  we  show,  respectively,  the 
steady  residual  versus  the  iteration  count  and  the  steady  residual  versus  the  CPU 
time.  We  perform  4  Newton  iterations  in  each  implicit  time  step.  The  stopping 
criterion  corresponds  to  a  steady  residual  norm  of  10-9. 

In  figures  12  and  13  we  show  a  comparison  of  the  four  methods  studied  in  this 
paper.  Clearly,  we  can  see  that  Newton-Krylov  matrix-free  outperforms  all  the 
other  three  methods. 

To  refine  our  analysis  we  have  performed  the  same  study  but  on  a  much  finer 
grid.  The  results  are  discussed  below. 

4.3  Defect-Correction  procedures:  mesh2  case 

4.3.1  Explicit  boundary  conditions 

We  compare  here  the  results  obtained  using  approximate  factorization  (AF) 
method  and  ILU/GMRES  when  the  boundary  conditions  are  explicit.  Similarly 
to  the  meshl  case,  we  observe  that  in  order  to  reach  the  same  level  of  accuracy, 
the  CPU  time  necessary  for  AF  method  is  almost  double  the  time  necessary  with 
the  Krylov  method  (ILU/GMRES)  as  can  be  seen  in  figures  14  and  15,  which 
show,  respectively,  the  iteration  count  versus  the  steady  residual  norm  and  the 
CPU  time  versus  the  steady  residual  norm. 

4.3.2  Implicit  boundary  conditions 

As  for  the  meshl  case,  we  first  compare  different  calculations  obtained  using 
different  CFL  numbers.  The  results  are  presented  in  figures  16  and  17  where  we 
show  a  comparison  of  the  steady  state  residual  norm  versus  CPU  time.  These 
calculations  are  performed  using  a  CFL  number  equal  respectively  to  5,  100,  and 
500.  These  comparisons  show  again  that  using  high  CFL  number  improves  the 
convergence  rate.  However,  when  the  CFL  number  is  larger  than  100,  no  further 
improvement  can  be  obtained.  This  is  due  to  the  mismatch  of  the  left  and  right 
hand  side  operators.  We  will  see  in  the  next  section  that  this  drawback  can 
be  removed  using  Newton-Krylov  methodology  described  in  section  2.  We  will 
now,  validate  the  CFL  strategy  described  in  section  3.  In  figures  18  and  19  we 
compare  the  calculations  performed  with  CFL  100  to  the  calculations  performed 
with  the  CFL  strategy  described  in  section  3.  This  comparison  shows  again  the 
validation  of  using  these  techniques.  It  also  shows  that  the  converged  solution 
is  obtained  in  about  the  same  CPU  time.  In  figure  20,  we  show  the  CFL  versus 
iteration  count.  Now  we  show  in  figures  21  and  22,  comparisons  of  steady  state 
residual  norm  versus  the  iteration  count  and  CPU  time  for  the  converged  solution 
obtained  using  explicit  boundary  conditions  with  a  CFL  number  equal  to  5  and 
using  implicit  boundary  conditions  with  a  CFL  number  equal  to  100.  We  observe 
that  an  improvement  in  terms  of  the  CPU  time  is  obtained  when  we  use  implicit 
boundary  conditions  as  compared  to  explicit  ones.  This  comparison  highlights 


11 


again  the  gain  obtained  using  implicit  boundary  conditions  when  Krylov  methods 
are  used  as  linear  solvers. 

4.4  Newton-Krvlov  matrix-free  procedures:  mesh2  case 

Here,  we  study  here  again  the  Newton-Krylov  matrix-free  methodology  described 
in  section  2.  The  techniques  used  in  the  choice  of  the  finite  differencing  param¬ 
eter  are  described  in  section  2.  To  take  full  advantage  of  the  power  of  Newton’s 
method,  and  thus  to  allow  a  more  rapid  asymptotic  convergence  to  the  steady 
state  solution,  we  use  the  CFL  strategy  described  in  section  3,  and  validated 
above.  The  ILU  preconditioner  we  use  here  is  formed  from  a  lower  order  dis¬ 
cretization  and  is  exactly  the  same  as  that  used  already  in  the  defect-correction 
procedure  studied  above.  This  results  in  a  mixed  Jacobian/Preconditioner  dis¬ 
cretization.  More  precisely,  the  explicitly  available  (Van  Leer)  first-order  flux 
vector  split  Jacobian  (Jvl)  is  used  to  precondition  the  implicitly  defined  (Roe) 
higher-order  flux  difference  split  Jacobian  (Jr)  at  each  implicit  time  step.  In 
matrix  terms,  the  correction  u  is  obtained  as  the  approximate  solution  of, 

(Jvl)~1Jru  =  -(Jvl)'1  Jr- 

While  in  the  defect-correction  context,  this  correction  was  obtained  as  the 
approximate  solution  of, 

Jvlu  =  —Jr- 

We  first  compare  the  results  obtained  using  ILU/GMRES  with  implicit  bound¬ 
ary  conditions  and  with  CFL  of  100,  and  using  Newton-Krylov  matrix-free.  The 
results  are  presented  in  figures  23  and  24,  in  which  we  show,  respectively,  the 
steady  residual  versus  the  iteration  count  and  the  steady  residual  versus  the  CPU 
time.  We  perform  4  Newton  iterations  in  each  implicit  time  step.  The  stopping 
criterion  corresponds  to  a  steady  residual  norm  of  10-9. 

In  figures  25  and  26  we  show  a  comparison  of  the  four  methods  studied  in 
this  paper.  The  comparison  highlights  the  efficiency  and  performance  of  the 
Newton-Krylov  matrix-free  method. 


5  Conclusions 

In  this  paper  we  have  demonstrated  the  performance  of  Krylov  based  methods. 
The  convergence  rate  is  improved  using  implicit  boundary  conditions  as  compared 
to  explicit  ones.  The  higher-order  Jacobian  needed  in  the  Newton’s  method  need 
not  be  computed  explicitly  in  the  Newton-Krylov  matrix-free  context.  The  use  of 
mixed  discretization  (higher-order)  Jacobian  /  (lower-order)  preconditioner,  re¬ 
sults  in  an  efficient  preconditioned  Newton-Krylov  matrix-free  algorithm  in  terms 
of  CPU  time  as  has  been  shown  in  the  numerical  experiments  presented  in  this 
study. 
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Figure  1:  Steady-state  residual  versus  iteration  count  for  approximate  factoriza¬ 
tion  (AF)  and  ILU/GMRES  solvers. 


Figure  2:  Steady-state  residual  versus  CPU  time  for  approximate  factorization 
(AF)  and  ILU/GMRES  solvers. 
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Figure  3:  Steady-state  residual  versus  iteration  count  for  ILU/GMRES  solver 
with  different  CFL:  6.5,  100,  and  500. 


Figure  4:  Steady-state  residual  versus  CPU  time  for  ILU / GMRES  solver  with 
different  CFL:  6.5,  100,  and  500. 
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Figure  5:  Steady-state  residual  versus  iteration  count  for  ILU/GMRES  solvers 
with  CFL  constant  equal  100,  and  with  adaptively  increasing  CFL. 


Figure  6:  Steady-state  residual  versus  CPU  time  for  ILU/GMRES  solvers  with 
CFL  constant  equal  100,  and  with  adaptively  increasing  CFL. 
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Figure  7:  CFL  versus  iteration  count  for  ILU/GMRES  solvers. 


Figure  8:  Steady  state  residual  versus  iteration  count  for  ILU/GMRES  solvers 
with  explicit  boundary  conditions  and  implicit  boundary  conditions. 
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Figure  9:  Steady  state  residual  versus  CPU  time  for  ILU/GMRES  solvers  with 
explicit  boundary  conditions  and  implicit  boundary  conditions. 


Figure  10:  Steady-state  residual  versus  iteration  count  for  defect  correction 
(ILU/GMRES)  and  Newton-Krylov  matrix-free  solvers. 
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Figure  11:  Steady-state  residual  versus  CPU  time  for  defect  correction 
(ILU/GMRES)  and  Newton-Krylov  matrix-free  solvers. 


Figure  12:  Steady-state  residual  versus  iteration  count  for  approximate  factor¬ 
ization  (AF),  ILU/GMRES  with  explict  boundary  conditions,  ILU/GMRES  with 
implicit  boundary  conditions,  and  Newton-Krylov  matrix-free  solvers. 
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Figure  13:  Steady-state  residual  versus  CPU  time  for  approximate  factorization 
(AF),  ILU/GMRES  with  explict  boundary  conditions,  ILU/GMRES  with  implicit 
boundary  conditions,  and  Newton- Krylov  matrix-free  solvers. 


Figure  14:  Steady-state  residual  versus  iteration  count  for  approximate  factor 
ization  (AF)  and  ILU/GMRES  solvers. 


Figure  15:  Steady-state  residual  versus  CPU  time  for  approximate  factorization 
(AF)  and  ILU/GMRES  solvers. 
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Figure  16:  Steady-state  residual  versus  iteration  count  for  ILU/GMRES  solver 
with  different  CFL:  5,  100,  and  500. 


Figure  17:  Steady-state  residual  versus  CPU  time  for  ILU/GMRES  solver  with 
different  CFL:  5,  100,  and  500. 


Figure  18:  Steady-state  residual  versus  iteration  count  for  ILU/GMRES  solvers 
with  CFL  constant  equal  100,  and  with  adaptively  increasing  CFL. 


Figure  19:  Steady-state  residual  versus  CPU  time  for  ILU/GMRES  solvers  with 
CFL  constant  equal  100,  and  with  adaptively  increasing  CFL. 
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Figure  21:  Steady  state  residual  versus  iteration  count  for  ILU/GMRES  solvers 
with  explicit  boundary  conditions  and  implicit  boundary  conditions. 


Figure  22:  Steady  state  residual  versus  CPU  time  for  ILU/GMRES  solvers  with 
explicit  boundary  conditions  and  implicit  boundary  conditions. 


Figure  23:  Steady-state  residual  versus  iteration  count  for  defect  correction 
(ILU /GMRES)  and  Newton-Krylov  matrix-free  solvers. 
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Figure  24:  Steady-state  residual  versus  CPU  time  for  defect  correction 
(ILU/GMRES)  and  Newton-Krylov  matrix-free  solvers. 


Figure  25:  Steady-state  residual  versus  iteration  count  for  approximate  factor¬ 
ization  (AF),  ILU/GMRES  with  explict  boundary  conditions,  ILU/GMRES  with 
implicit  boundary  conditions,  and  Newton-Krylov  matrix-free  solvers. 
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Figure  26:  Steady-state  residual  versus  CPU  time  for  approximate  factorization 
(AF),  ILU/GMRES  with  explict  boundary  conditions,  ILU/GMRES  with  implicit 
boundary  conditions,  and  Newton-Krylov  matrix-free  solvers. 
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